Skip to content

Conversation

lexnv
Copy link
Contributor

@lexnv lexnv commented Jul 23, 2025

This is a proof of concept/poc for for sending notifications asynchronously.

There are several benefits to this:

  • notifications are sent in a background task using FuturesUnordered, allowing for parallel handling
  • No twicking of constants or increase memory usage
  • Uses natural backpressure, ideally every notif protocol wants this (ie, spamming disputes)

While this is still a PoC, there are a few areas that could be improved:

  • No timeout for sending transactions to a peer (average time for the scenario led to 130us during each async call, with a peak of 1ms)
  • Consider switching the interval timer to use Tokio interval
  • Make sure tokio::select! is handled safely to avoid panics
  • the transaction pool might not always have all transactions available
  • Improve error handling for cases when peers disconnect or async sends fail
  • Batching in substrate/client/service/src/builder.rs could be optimized (if further needed)

cc @lrubasze @skunert

Signed-off-by: Alexandru Vasile <[email protected]>
@lexnv lexnv changed the title poc: async backpressure for notifs poc: async backpressure for transaction notifications Jul 23, 2025
@lexnv lexnv requested a review from skunert August 6, 2025 11:17
github-merge-queue bot pushed a commit that referenced this pull request Oct 16, 2025
# Description

Fixes gossiping and scalability issues in the statement-store
networking.

1. Reduced gossiping traffic by propagating only recent statements
instead of all.
2. Added an early check for statements that the node already has to skip
duplicate processing.
3. Added splitting of large statement batches to stay under
MAX_STATEMENT_NOTIFICATION_SIZE; oversized individual statements are
skipped.
4. MAX_STATEMENT_NOTIFICATION_SIZE was updated to the commonly used 1MB,
which drastically improved the gossiping speed.
5. Notifications are sent asynchronously. I don't see much difference in
performance, but according to @lexnv, it's better to do:
#9296.
6. Added a 10s timeout to handle very slow or disconnected peers.

## Integration

Internal optimizations to the gossip protocol. No downstream changes
required.

Related PR: #9965

## Things to handle in further PRs
- After this PR, nodes don't send all statements to new peers anymore,
only the recent ones.
- After restarting, the node doesn't re-gossip statements it wasn't
gossiped.
- Broadcasting notifications to all peers when the first peer is slow is
limited. We could instead use a FuturesUnordered.

---------

Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Bastian Köcher <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant